STDFormer: Spatial-Temporal Motion Transformer for Multiple Object Tracking
Authors
Abstract
Mainstream multi-object tracking methods exploit appearance information and/or motion information to achieve inter-frame association. However, dealing with similar appearance and occlusion is a challenge for appearance information, while motion information, limited by linear assumptions, is prone to failure in nonlinear motion patterns. In this work, we disregard appearance clues and propose a pure motion tracker to address the above issues. It dexterously utilizes a Transformer to estimate complex motion and achieves high performance with low computing resources. Furthermore, contrastive learning is introduced to optimize feature representation for robust association. Specifically, we first exploit the long-range modeling capability of the Transformer to mine motion intention at the temporal level and spatial interaction between objects, and introduce prior detection information to constrain the range of motion estimation. Then, contrastive learning is employed as an auxiliary task to extract reliable features, and affinity is computed with bidirectional matching to improve the computation of the affinity distribution. In addition, given that both tasks are dedicated to narrowing the embedding distance between tracked object features, we design a joint-motion-and-association framework to unify the two tasks in one optimization. The experimental results achieved on three benchmark datasets, MOT17, MOT20, and DanceTrack, verify the effectiveness of our proposed method. Compared with state-of-the-art methods, STDFormer sets a new record on DanceTrack and achieves competitive performance on MOT17 and MOT20. This demonstrates the advantage of our method in handling associations under similar appearance, occlusion, or nonlinear motion. At the same time, its significant advantages over Transformer-based and learning-based motion methods suggest a promising direction for their application in MOT. To evaluate generalization to unmanned aerial vehicle (UAV) videos, we also evaluate STDFormer on VisDrone2019. The results show that it performs well on VisDrone2019, which proves it can handle small-scale UAV videos well. The code is available at https://github.com/Xiaotong-Zhu/STDFormer.
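The abstract mentions computing affinity with bidirectional matching to sharpen the affinity distribution between tracks and detections. The paper does not spell out the exact formulation here, but a common realization of this idea is a dual-softmax: normalize the similarity matrix along both the track axis and the detection axis and take the product, so a pair scores highly only when each side prefers the other. The sketch below illustrates that general technique on toy embeddings; the function name, temperature value, and use of cosine similarity are illustrative assumptions, not STDFormer's actual implementation.

```python
import numpy as np

def dual_softmax_affinity(track_emb, det_emb, temperature=0.1):
    """Bidirectional-matching sketch (assumed dual-softmax form):
    softmax over detections per track, times softmax over tracks
    per detection, yielding a mutually consistent affinity matrix."""
    # Cosine similarity between L2-normalized embeddings.
    t = track_emb / np.linalg.norm(track_emb, axis=1, keepdims=True)
    d = det_emb / np.linalg.norm(det_emb, axis=1, keepdims=True)
    sim = t @ d.T / temperature

    # Softmax over detections (rows: one distribution per track).
    row = np.exp(sim - sim.max(axis=1, keepdims=True))
    row /= row.sum(axis=1, keepdims=True)
    # Softmax over tracks (columns: one distribution per detection).
    col = np.exp(sim - sim.max(axis=0, keepdims=True))
    col /= col.sum(axis=0, keepdims=True)

    return row * col  # joint affinity distribution

# Toy usage: 3 tracks, 4 detections, 16-dim embeddings.
affinity = dual_softmax_affinity(np.random.rand(3, 16), np.random.rand(4, 16))
print(affinity.shape)  # (3, 4)
```

The bidirectional product suppresses one-sided matches (a detection that is the best choice for a track but itself prefers a different track), which is what makes the resulting distribution more reliable than a single row-wise softmax.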
Similar papers
Spatial-Temporal Reasoning Based on Object Motion
This paper describes the continuing development of a system for tracking multiple man made objects, (typically vehicles) moving in a natural open world scene, where the detected motion is used to construct a structural representation of the scene. The system assumes no a priori knowledge of any structure within the image, but begins building a map of the scene on a frame by frame basis. The map...
Self-motion impairs multiple-object tracking.
Investigations of multiple-object tracking aim to further our understanding of how people perform common activities such as driving in traffic. However, tracking tasks in the laboratory have overlooked a crucial component of much real-world object tracking: self-motion. We investigated the hypothesis that keeping track of one's own movement impairs the ability to keep track of other moving obje...
Multiple Object Tracking Using Local Motion Patterns
This paper presents an algorithm for multiple-object tracking without using object detection. We concentrate on creating long-term trajectories for unknown moving objects by using a model-free tracking algorithm. Each individual object is tracked by modeling the temporal relationship between sequentially occurring local motion patterns. The algorithm is based on shape and motion descriptors of ...
Conflicting motion information impairs multiple object tracking.
People can keep track of target objects as they move among identical distractors using only spatiotemporal information. We investigated whether or not participants use motion information during the moment-to-moment tracking of objects by adding motion to the texture of moving objects. The texture either remained static or moved relative to the object's direction of motion, either in the same di...
It is time to integrate: The temporal dynamics of object motion and texture motion integration in multiple object tracking
In multiple-object tracking, participants can track several moving objects among identical distractors. It has recently been shown that the human visual system uses motion information in order to keep track of targets (St. Clair et al., Journal of Vision, 10(4), 1-13). Texture on the surface of an object that moved in the opposite direction to the object itself impaired tracking performance. In...
Journal
Journal title: IEEE Transactions on Circuits and Systems for Video Technology
Year: 2023
ISSN: 1051-8215, 1558-2205
DOI: https://doi.org/10.1109/tcsvt.2023.3263884